Abstract

It is suggested that the economics of present large-scale scientific
computers could benefit from a greater investment in hardware to mechanize
multiplication and division than is now common. As a move in this direction
a design is developed for a multiplier which generates the product of two
40-digit numbers using purely combinational logic, i.e., in one gating
step. This design is described in some detail to establish that no
exceptional cases invalidate the assertion made about its speed of operation.
Using straightforward diode-transistor logic, it appears presently possible
to obtain products in under one microsecond, and quotients in three. A
rapid square root process is also outlined. Approximate component counts
are given for the proposed design.


I.    Introduction 

A contemporary computer spends a large percentage of its time
executing multiplication and, to a lesser extent, division. The recent
advent in very large machines of bookkeeping controls, operating in advance
of the arithmetic unit to execute memory fetches, stores, address
modification, etc., has tended to increase this percentage by relieving the
arithmetic unit of many trivial burdens. The arithmetic unit of such a
machine,  when  used  for  scientific  computations,  will  spend  nearly  half 
its  time  multiplying  or  dividing.   Paradoxically,  the  amount  of  hardware 
built  into  large  machines  specifically  for  these  operations  is  rarely  very 
great.   Thus  the  situation  has  arisen,  viewed  in  the  context  of  a  very  large 
machine  involving  a  heavy  investment  in  memory,  peripheral  equipment  and 
controls,  that  it  may  be  advantageous  to  the  economy  of  the  machine  as  a  whole 
to  increase  the  hardware  investment  in  the  operations  of  multiplication  and 
division,  even  beyond  the  point  where  an  increment  of  this  investment  yields 
an  equal  incremental  increase  in  multiplication-division  speed.   Consistent 
with  this  point  of  view,  this  paper  will  describe  the  logical  design  and 
economics  of  a  multiply-divide  unit  designed  for  maximum  possible  speed. 

For  multiplication,  which  will  be  discussed  first,  obvious  ways 
to  get  high  speed  are  to  (a)  reduce  the  number  of  partial  products  to  be 
summed,  and  (b)  to  extend  the  parallelism  used  in  their  addition.   The 
limiting  case  of  the  latter  course  in  which  the  product  is  formed  by 
combinatorial  logic  in  one  gating  step  is  treated  below. 

This approach, while clearly involving a great deal of hardware,
has  some  by-product  advantages.   First,  control  complexity  is  reduced  to 
the  minimum  of  a  single  step.   Second,  with  present  transistor  technology 
the  time  for  the  distribution  of  gate  signals  to  a  flipflop  register 
augmented  by  the  time  required  for  the  flipflops  to  settle  into  their  new 
state  generally  exceeds  by  a  considerable  factor  the  propagation  delay 
through  a  combinatorial  logic  element.   Thus  there  is  a  strong  argument 
toward  performing  many  levels  of  logic  in  each  gating  step. 

In  this  paper  attention  will  be  restricted  to  the  multiplication 
and division of 40-digit two's complement binary numbers.


II.    The  Adder  Tree 

Given a large number of numbers to be summed by combinatorial
logic, it is clearly unnecessary and undesirable to have carry propagation
at each intermediate stage of the additions. A straightforward approach,
used here, employs a sufficient number of full-adder words, each consisting
of as many full-adder circuits as there are significant digits in the
numbers to be added. The full-adder circuits are not interconnected in
any way. A full-adder word gives two output numbers, sum and carry, whose
sum equals the sum of the three input numbers. If there are n numbers to
be summed, n - 2 full-adder words will be needed to express the sum as two
numbers. These two numbers must then be summed in a carry-propagating adder
to produce the final result.
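
The reduction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the circuit itself: the names `full_adder_word` and `carry_save_sum` are invented here, and Python integers stand in for words of unconnected full-adder circuits.

```python
def full_adder_word(x, y, z):
    """Reduce three numbers to (sum, carry) with no carry propagation.

    Every bit position behaves as an independent full adder; the carry
    word is displaced one position to the left.
    """
    s = x ^ y ^ z
    c = ((x & y) | (y & z) | (z & x)) << 1
    return s, c


def carry_save_sum(numbers):
    """Reduce n numbers to two, counting the full-adder words used."""
    pending = list(numbers)
    words_used = 0
    while len(pending) > 2:
        x, y, z = pending[:3]
        pending = pending[3:] + list(full_adder_word(x, y, z))
        words_used += 1
    return pending, words_used


operands = [3, -7, 12, 5, -1, 20, 9]      # arbitrary sample values
(a, b), words_used = carry_save_sum(operands)
final = a + b                             # the one carry-propagating addition
```

Each call removes one number from the pending list, which is why n inputs need n - 2 full-adder words before the final carry-propagating addition.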




All partial products to be summed are generated simultaneously.
What is desired is an arrangement of the n - 2 full-adder words that starts
work on all partial products simultaneously and produces the result after
as few full-adder propagation delays as possible. This suggests a tree
structure of the type shown in Figure 1. In this figure each box represents
a full-adder word, with the three incoming numbers identified at the top. The
sum and carry numbers leaving the bottom of the box are identified by
the letters s and c.

As can be seen, starting with the carry-propagating adder, each
additional level of full-adder words increases the number of available
inputs by a factor of 1.5 or less. The inputs shown by W1, W3, etc.,
to W39 are the partial product numbers. The example shown in the figure,
with twenty input numbers, corresponds to the particular multiplier
design to be developed below.
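
The growth of "1.5 or less" per level can be tabulated with a short sketch. The function names are hypothetical, and the recurrence is an assumption drawn from the description above: each pair of inputs at one level can be supplied by a single three-input adder word in the level above, and an unpaired input passes straight through.

```python
def max_inputs(levels):
    """Largest number of inputs a tree with this many levels can accept."""
    n = 2  # the carry-propagating adder itself takes two numbers
    for _ in range(levels):
        # each pair of inputs is fed by one full-adder word (three inputs);
        # an odd input left over is passed down unchanged
        n = (n // 2) * 3 + n % 2
    return n


def levels_needed(count):
    """Smallest number of full-adder levels able to accept `count` inputs."""
    levels = 0
    while max_inputs(levels) < count:
        levels += 1
    return levels


capacities = [max_inputs(k) for k in range(8)]
```

Under this recurrence twenty inputs need seven levels, which agrees with the references to "level seven" later in the text.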

Figure 1.   Adder Tree

Certain complications arise from the fact that the summands, being
partial products, are shifted relative to one another. Thus the three input
numbers to any one adder word do not in general cover the same range of
digital positions. At the less significant end of the adder word, there will
be digital positions having only one or two inputs. Since the function of an
adder word is to reduce three input numbers to two output numbers, these
digital positions of the adder word need not contain full adder circuits. In
some cases, they need contain only one or two inverter circuits. Unfortunately,
the same simplification does not apply at the more significant end. Each
partial product, and hence in general any input number to an adder word, may
be negative. In the two's complement representation, a number to be added in
an adder word to a number of greater significance must be augmented to the
left of its sign digit with copies of its sign digit. Thus, the adder word
must contain full adder circuits as far left as the most significant (i.e., sign)
digit of its most significant input number if all input numbers may be negative.
Also, where the outputs of an adder word enter another adder word together
with an input number of greater significance, those outputs must be augmented
by copies of their sign digits, so that the most significant full adder circuit
of the first word must provide outputs each capable of driving several full
adder circuit inputs. However, in such cases, it is possible to arrange that
only the carry output number need be so augmented, the sum output being
restricted to positive values.

As will be shown when the circuitry proposed for the full adder
is described, the carry output may be provided with extra fanout without overall
loss of speed (see Figure 2). If the three sign digit inputs to the most
significant adder stage of an adder word are denoted x, y and z, the adder
stage may take the normal form, giving two outputs

$$c = xy \lor yz \lor zx, \qquad s = (x \lor y \lor z)\,\bar{c} \lor xyz$$

where, of course, the digit c enters the next adder word displaced one digital
position to the left, and provided that both s and c are augmented by copies
as far left as is necessary. This form will be used only where the s output
does not in fact require augmenting, as provision for the extra fanout would
slow the addition. To ensure that the sum output word is always positive,
use will be made of the fact, true in all cases where this maneuver is
required, that the three input numbers are not of equal significance. Two of
the numbers will have identical digits in both the sign and next less
significant digital positions. Thus the left hand two adders of the adder word
must sum three numbers having the form at their left-hand ends

Figure 2.   A Full Adder Circuit

q is developed as the sum modulo two of y, a and b in the usual way.
The  logical  expressions  for  the  remaining  digits  are: 

r  =  x  ^   a  ^  b 

t  =  (a  v  b)  •  y  •  (ab) 


p  =  r  ■  [x  ■  (a  v  b)] 

The  only  digit  requiring  fanout  in  any  circumstances  will  be  r,  which  can  be 
amplified  without  loss  of  speed  (see  Figure  3). 

When  this  form  of  sum  and  carry  outputs  from  one  adder  stage  both  enter 
the  same  adder  word  in  the  next  stage  together  with  a  third  input  of  greater 
significance,  the  left  hand  end  of  the  second  adder  word  will  have  stages 
with only two significant input digits. In this case, the circuits used in
these stages can be half-adders.

At least one adder word in each level of the adder tree will have
at its more significant end adder circuits handling digits of the weight
of the sign digit of the final product. Since two's complement representation
is proposed, carry outputs from these circuits may be ignored. No adder word
will contain digital positions to the left of the sign digit of the final
product.

At the other end of the adder words, each level of the adder tree
will  contain  an  adder  word  some  of  whose  least  significant  output  digits  have 
less  significance  than  any  output  digits  of  any  other  adder  word  in  the  same 
level.   These  digits  may  bypass  all  remaining  levels  of  the  tree  and  enter 
directly  into  the  (double-length)  carry  propagating  adder.   Thus,  at  the  time 
when  adder  word  one  produces  its  output  numbers,  these  will  not  contain  the 
less  significant  end  of  the  product,  which  will  have  already  been  produced 
in  its  final  form  by  the  right-hand  end  of  the  carry  propagating  adder. 

Figure 3.   Possible circuit for the two most significant stages of an
adder word giving a non-negative sum output. Biasing resistors
and clamps not shown.


III.   Generation of Partial Products

Each partial product will be selected from a limited number of
multiples of the multiplicand on the basis of some of the multiplier digits.
It is proposed that the available multiples be +2, +1, 0, -1, and -2 times
the multiplicand. Partial product $W_i$, where i is odd, will depend on
multiplier digits $x_{i-1}$, $x_i$ and $x_{i+1}$, where the multiplier digits are
labelled from $x_0$ (the sign digit) to $x_{39}$ (the least significant). Digit
$x_{40}$ is taken as zero. The rules for selection of a multiple are:


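$$
\begin{aligned}
+2 &\quad\text{if}\quad \bar{x}_{i-1}\, x_i\, x_{i+1} \\
+1 &\quad\text{if}\quad \bar{x}_{i-1}\, x_i\, \bar{x}_{i+1} \;\lor\; \bar{x}_{i-1}\, \bar{x}_i\, x_{i+1} \\
0 &\quad\text{if}\quad \bar{x}_{i-1}\, \bar{x}_i\, \bar{x}_{i+1} \;\lor\; x_{i-1}\, x_i\, x_{i+1} \\
-1 &\quad\text{if}\quad x_{i-1}\, \bar{x}_i\, x_{i+1} \;\lor\; x_{i-1}\, x_i\, \bar{x}_{i+1} \\
-2 &\quad\text{if}\quad x_{i-1}\, \bar{x}_i\, \bar{x}_{i+1}
\end{aligned}
$$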

This  recoding  scheme  has  the  following  advantages: 

i)    It  requires  little  logic. 

ii)   All  selections  can  be  made  simultaneously;  the 
recoding  is  not  a  serial  process. 

iii)   The  multiples  used  can  be  obtained  from  the  multiplicand 
by  the  trivial  processes  of  complementation  and 
displacement . 

iv)    It produces only 20 partial products from the 40-digit
multiplier.

v)     It applies without alteration to the leftmost digits
of the multiplier.
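
The recoding can be checked numerically with a small sketch. It rests on assumptions made for illustration only: the multiplier is held as a scaled Python integer in the range -2^39 to 2^39 - 1, the function names are invented, and each selected multiple is computed as -2·x(i-1) + x(i) + x(i+1), which is the arithmetic value selected by the rules above.

```python
def recode(multiplier, n=40):
    """Return {i: multiple} for odd i, each multiple in {-2, ..., +2}.

    x[0] is the sign digit, x[n-1] the least significant digit, and
    x[n] is taken as zero, as in the text.
    """
    x = [(multiplier >> (n - 1 - k)) & 1 for k in range(n)] + [0]
    return {i: -2 * x[i - 1] + x[i] + x[i + 1] for i in range(1, n, 2)}


def value_of(multiples, n=40):
    """Recombine the recoded multiples into the original integer."""
    return sum(d * 2 ** (n - 1 - i) for i, d in multiples.items())


samples = [0, 1, -1, 2**39 - 1, -2**39, 123456789, -987654321]
results = [(m, recode(m)) for m in samples]
```

All twenty multiples depend only on the multiplier digits themselves, so they can be formed in parallel, as advantage (ii) states.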


Alternative schemes involving a smaller number of partial products,
each selected from more possibilities, are considered inadvisable in this
context. If, say, eight or nine multiples are allowed, the number of partial
products is reduced to 14. The time saving however is small; the adder tree,
while using 12 rather than 18 adder words, is shortened by only one level
in seven. Such a recoding scheme would require multiples not obtainable by
shifting and complementation as above. The generation of these multiples would
almost certainly require longer than the propagation delay of the one adder
tree level saved. Moreover, it is not clear that any equipment saving could be
made in this way, as the circuits required for selection of the partial
products involve a considerable amount of equipment, which increases linearly
with the number of possible multiples.

A two's complement number representation has been assumed. When a negative
multiple of the multiplicand is selected, the complement of the multiplicand is used,
and a correction applied to this complement by adding one to its least-significant
digit. To add this correction directly to the complement in a special adder provided for
the purpose would be both time-consuming and expensive. Instead, the correction digit
for some partial product $W_i$, occurring in digital position i + 39, can be appended to
the right-hand end of the next more significant partial product $W_{i-2}$, thus extending
this partial product to the right from position i + 37 to i + 39. A slight improvement
of this method is to so recode the last digit of $W_i$ that the correction bit will occur
in position i + 38, thus extending $W_{i-2}$ by one digital position rather than two.

Suppose the least-significant digit of the possibly shifted, but as yet
uncomplemented, multiplicand is x. Instead of setting, for a negative multiple, digit
i + 39 of $W_i$ equal to $\bar{x}$, and digit i + 39 of $W_{i-2}$ equal to 1, we set digit
i + 39 of $W_i$ equal to x, and digit i + 38 of $W_{i-2}$ equal to $\bar{x}$.
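
A short worked check of why this recoding leaves the sum unchanged, taking the weight of position i + 39 as the unit so that position i + 38 has weight two. The bar denotes the complement 1 - x; this notation is assumed here for the illustration.

```latex
\underbrace{\bar{x} + 1}_{\text{complement digit plus correction at } i+39}
\;=\;
\underbrace{x}_{\text{digit at } i+39} \;+\; \underbrace{2\,\bar{x}}_{\text{digit at } i+38}
\qquad
\begin{cases}
x = 0: & 1 + 1 = 0 + 2 \\
x = 1: & 0 + 1 = 1 + 0
\end{cases}
```

Both values of x give the same total, so the correction can be carried one position further left at no cost in accuracy.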

Thus, after allowing a possible left displacement of the multiplicand of one
place for multiples of modulus two, the range of digital positions occupied by significant
digits of $W_i$ is i - 1 to i + 40.

The correction to W1 cannot be so treated, as it is the most significant
partial product. Its correction digit, which, by use of the above technique, is made to
lie in position 39, is instead added to the least-significant partial product W39. This
can be done without loss of time in the following way.

Digits of W37 and W39 in positions to the left of and including position
39 are not fed directly into adder word 17. Instead, they are fed into a short adder
word section (number 19) in level seven of the adder tree, having adder stages in
digital positions 36 to 39. This section also receives, in position 39, the W1
correction digit. The sum and carry outputs of this section cover digital positions 36
to 39 and 34 to 38 respectively. These, together with positions 34 to 39 of W35, enter
positions 34 to 39 of adder word 17. Since level seven of the adder tree is necessary
in any case, this additional section does not delay the final result.


IV.   Dimensions  of  the  Adder  Tree 

Having decided upon the formation of the partial products, and
the general scheme for their addition, one can now fix the length and
relative significance of the numbers appearing at the inputs to the various
adder words, and hence the dimensions of the adder words themselves. In
the following list, the inputs to each adder word are listed with their
ranges of significant digital positions. Partial product input numbers, as
modified by the addition of correction bits, are called $W_i$. Sum and carry
output numbers of adder word j are called $s_j$ and $c_j$. Where the last few
output digits of an adder are fed directly to the carry propagating adder,
this is shown by the range of digital positions involved and the word "out."
In this case, the digits going "out" are not included in the listed sum and
carry outputs. Where the last few stages of an adder word have two or fewer
inputs, and hence do not involve the use of full adder circuits, the range
of digital positions is shown with the word "void."


Adder
Word   Inputs                             Outputs      Stages         Remarks

19     38, 39 of W39                      s: 36-39     36-39 (4)
       36-39 of W37                       c: 34-38
       Correction to W1 at position 39

18     W3 (2-43)                          s: 1-47      2-43 (42)      44-47 void
       W5 (4-45)                          c: 1-45
       W7 (6-47)

17     W39 (40-78), s19 (36-39)           s: 34-73     34-75 (42)     74-78 out
       W37 (40-77), c19 (34-38)           c: 32-73                    76-78 void
       W35 (34-75)

16     W33 (32-73)                        s: 28-73     28-69 (42)     70-73 void
       W31 (30-71)                        c: 26-71
       W29 (28-69)

15     W27 (26-67)                        s: 21-67     22-63 (42)     64-67 void
       W25 (24-65)                        c: 21-65
       W23 (22-63)

14     W21 (20-61)                        s: 16-61     16-57 (42)     58-61 void
       W19 (18-59)                        c: 14-59
       W17 (16-57)

13     W15 (14-55)                        s: 10-55     10-51 (42)     52-55 void
       W13 (12-53)                        c: 8-53
       W11 (10-51)

12     W9 (8-49)                          s: 0-49      1-45 (46)      46-49 void
       s18 (1-47)                         c: 0-47
       c18 (1-45)

11     s17 (34-73)                        s: 28-71     28-71 (44)     72, 73 out
       c17 (32-73)                        c: 26-71
       s16 (28-73)

10     c16 (26-71)                        s: 21-71     21-65 (45)     66-71 void
       s15 (21-67)                        c: 19-67
       c15 (21-65)

9      s14 (16-61)                        s: 9-61      10-55 (46)     56-61 void
       c14 (14-59)                        c: 9-59
       s13 (10-55)

8      c13 (8-53)                         s: 0-53      0-47 (48)      48-53 void
       s12 (0-49)                         c: 0-49
       c12 (0-47)

7      s11 (28-71)                        s: 21-67     21-71 (51)     68-71 out
       c11 (26-71)                        c: 19-67
       s10 (21-71)

6      c10 (19-67)                        s: 9-67      9-59 (51)      60-67 void
       s9 (9-61)                          c: 7-61
       c9 (9-59)

5      s8 (0-53)                          s: 0-53      0-40 (41)      41-53 void
       c8 (0-49)                          c: 0-49
       W1 (0-40)

4      s7 (21-67)                         s: 9-61      9-61 (53)      62-67 out
       c7 (19-67)                         c: 7-61
       s6 (9-67)

3      c6 (7-61)                          s: 0-61      0-49 (50)      50-61 void
       s5 (0-53)                          c: 0-53
       c5 (0-49)

2      s4 (9-61)                          s: 0-53      0-61 (62)      54-61 out
       c4 (7-61)                          c: 0-53
       s3 (0-61)

1      s2 (0-53)                          s: 0-53      0-53 (54)      All digits out
       c2 (0-53)                          c: 0-52
       c3 (0-53)

Total adder stages:   847


V.    Circuitry

The suggested circuits employ diode OR-AND logic and transistor
inverting amplifiers, run saturated. Figure 2 shows a possible circuit for
a full-adder stage. It is designed to be fed from, and to feed, exactly
complementary circuits using npn transistors with their emitters tied to
-2 volts and collectors caught at +0.5 volts. Either output can drive one
input of a complementary circuit. Thus, odd-numbered levels of the adder
tree would employ one polarity of circuit, and even-numbered levels would
employ the other. The circuit shown generates the complement of the sum
and carry outputs as normally defined. However, since the logical equations
for both sum and carry of a full adder are self-dual, the same circuit
configuration is employed in both varieties of circuit. Another way of looking
at the polarity of signals is to define the output of a transistor of either
sort as one if the transistor is on, in which case the circuit shown will
give true signals at all adder tree levels. Notice that if all inputs to the
adder tree are zero in this latter convention, all transistors of the tree
will be cut off. This point will be of importance to the discussion of
division.

The component values shown guarantee, in an assumed worst-case
combination of ±3 per cent resistor and power supply variations, base turn-on
and turn-off currents of about one-twelfth of the maximum collector standing
current. The use of modern epitaxial transistors of around $1 in cost, and
of diodes of around $.30 in cost, should, with reasonable care in packaging,
give a circuit propagation delay of about 30 nsec per transistor, or 60 nsec
for a full adder.

It is proposed that the partial products be generated using
OR-AND-NOT circuits of the same general type as used in the full adder. One
such circuit, and hence one transistor, would be required for each digit of
each of the twenty partial products, a total of 800 transistors. A circuit
producing the digit of weight $2^{-j}$ in partial product $W_i$ would produce the
function

$$(v_{j-i+1} \lor \overline{+2}) \cdot (v_{j-i} \lor \overline{+1}) \cdot (\bar{v}_{j-i} \lor \overline{-1}) \cdot (\bar{v}_{j-i+1} \lor \overline{-2})$$

where $v_k$ are the digits of the multiplicand, numbered from zero in order of
decreasing significance, and the signals +2, +1, -1 and -2 are the signals
generated by recoding the multiplier digits as described in Section III to
give the multiple for $W_i$. The recoded multiplier digit zero can be obtained
by making signals +1 and -1 true simultaneously. Partial products generated
for introduction to level seven of the adder tree must differ in polarity
from those for introduction to levels four or six. Complementary forms of
the selector circuit would be used for the two polarities.
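
The behaviour of such a selector can be sketched as a product of four OR terms. This is a logical model only, under assumptions made here: `v_same` stands for the multiplicand digit aligned with the output position, `v_next` for the next less significant multiplicand digit (used for the doubled multiples), and the function name is invented. Circuit polarity is ignored.

```python
def selector(v_same, v_next, plus2, plus1, minus1, minus2):
    """One digit of a partial product, as an AND of four OR terms.

    A term whose select signal is false evaluates to 1 and drops out;
    a term whose signal is true passes its (possibly complemented) digit.
    """
    return int((v_next or not plus2)
               and (v_same or not plus1)
               and (not v_same or not minus1)
               and (not v_next or not minus2))


# Truth table rows: (v_same, v_next, +2x, +1x, -1x, -2x, zero)
table = [
    (v, vn,
     selector(v, vn, 1, 0, 0, 0),   # +2 times: next less significant digit
     selector(v, vn, 0, 1, 0, 0),   # +1 times: the digit itself
     selector(v, vn, 0, 0, 1, 0),   # -1 times: complemented digit
     selector(v, vn, 0, 0, 0, 1),   # -2 times: complemented next digit
     selector(v, vn, 0, 1, 1, 0))   # zero: +1 and -1 true together
    for v in (0, 1) for vn in (0, 1)
]
```

Making +1 and -1 true together forces the product of a digit and its complement, which is always zero, so no separate "zero" select line is needed. The added one for negative multiples is handled separately by the correction digit described in Section III.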

Each recoded multiplier digit signal must drive 40 inputs, as
must each polarity of multiplicand digit. Thus, 160 driver circuits of about
200 ma. capacity are required. Modern silicon epitaxial transistors can be
used to make such drivers with delay times of about 30 nsec. However, in
view of the possibly large spatial fanout of these signals, a delay time
estimate of 100 nsec might be more realistic.

The logical conditions for the recoded multiplier digit signals
would each be generated from an OR-AND-NOT single transistor circuit forming
the input to the associated driver. Eighty such circuits are needed.

The circuitry required for the 79-digit carry-propagation adder
will not be discussed. General designs have appeared in the literature
capable of performing the carry propagation in a time of the order of 100 nsec.

It should be noted that of the 79 digital positions, only the
propagation time over the most-significant 54 will be additive to the
propagation delay through the adder tree.
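
As a rough cross-check on the figures quoted in this section, the delays can simply be added. This is a back-of-envelope sketch, not a timing analysis: it assumes the stages act strictly in series, takes one transistor delay each for the recoding and selector circuits, uses the more cautious 100 nsec driver estimate, and treats the carry-propagating adder as 100 nsec.

```python
NSEC_PER_TRANSISTOR = 30      # per-transistor delay quoted in the text
NSEC_PER_FULL_ADDER = 60      # full-adder delay quoted in the text
TREE_LEVELS = 7               # levels in the adder tree for 20 inputs

delays_nsec = {
    "multiplier recoding circuit": NSEC_PER_TRANSISTOR,
    "driver (cautious estimate)": 100,
    "partial-product selector": NSEC_PER_TRANSISTOR,
    "adder tree": TREE_LEVELS * NSEC_PER_FULL_ADDER,
    "carry-propagating adder": 100,
}

total_nsec = sum(delays_nsec.values())
```

On these assumptions the total comes to 680 nsec, which is consistent with the claim in the Abstract that products can be obtained in under one microsecond.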


VI.   The  Possibilities  for  Division 

Although a case can be made for the use of a structure of the form
described solely for the purpose of multiplication, it is of interest to see
whether it can be used to execute a reasonably rapid division when suitably
augmented. The author has been unable to discover any very effective method
for direct division in the multiplier. However, it appears possible to use it
in a four-step process to obtain the reciprocal of a 40-digit number.

If  the  multiplier  structure  is  to  be  used  efficiently,  advantage 
must  be  taken  of  its  ability  to  sum  many  numbers  simultaneously  and  rapidly. 
In  normal  division  processes,  the  usual  direction  taken  to  accelerate  the 
process  is  to  inspect  the  more  significant  digits  of  partial  remainder  (or 
dividend)  and  divisor,  and  to  guess  on  the  basis  of  these,  the  next  few 
quotient  digits.   The  product  of  the  guessed  digits  and  the  divisor  is  then 
formed,  possibly  using  simultaneous  addition  and  recoding  of  the  guessed 
quotient  digits,  and  subtracted  from  the  partial  remainder  to  give  a  new 
partial remainder. This method can in principle be carried to whatever extent
desired, but the logic required for guessing quotient digits becomes rapidly more
complex as the number of digits guessed per step increases. The practical limit
is probably not much more than six quotient digits per step. The partial
remainders may be left in a carry-unassimilated form to save time, but this
considerably complicates the circuits required for guessing. In any case, the
guesses can never be always correct, so the quotient must normally be developed in
a redundant form, e.g., as two numbers which must be summed to give the final
quotient. Excellent though this method and its variants may be for conventional
arithmetic units, it does not seem feasible to the author to extend it to the
point where it would make good use of the adder tree.

The proposed method is essentially based on the following iterative division
process: Given x and y, to divide y by x, set

$$a_1 = xp, \qquad b_1 = yp$$

where p is some approximation to the reciprocal of x, and iterate

$$a_{n+1} = a_n(2 - a_n), \qquad b_{n+1} = b_n(2 - a_n)$$

This process converges quadratically, $a_n$ to one and $b_n$ to the required quotient.
That is, the number of correct digits in $b_{n+1}$ is double that in $b_n$. If p is
sufficiently good an approximation to 1/x that xp differs from one by $2^{-5}$ at most,
then three of the repetitive steps will give a 40-bit quotient. The part of the
third step which generates $a_4$ is not needed.

The values of $(2 - a_n)$ used at each step need not be exact, provided that
the same value is used to form both $a_{n+1}$ and $b_{n+1}$.
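
The quadratic convergence can be seen in a small exact-arithmetic sketch. The function name and the sample values of x, y and p are chosen here purely for illustration; the exact factor (2 - a) is used at every step, whereas the design described later uses shortened approximations to it.

```python
from fractions import Fraction


def divide(y, x, p, steps=3):
    """Iterate a <- a(2 - a), b <- b(2 - a), starting from a = xp, b = yp."""
    a, b = x * p, y * p
    for _ in range(steps):
        m = 2 - a            # the same factor is applied to both a and b
        a, b = a * m, b * m
    return a, b


x = Fraction(5, 8)           # divisor, normalized to lie in [1/2, 1)
y = Fraction(1, 3)           # dividend
p = Fraction(13, 8)          # rough reciprocal: xp = 65/64, within 2**-5 of one
a4, b4 = divide(y, x, p)
error = abs(b4 - y / x)
```

Since 1 - a(2 - a) = (1 - a)^2, the distance of a from one is squared at each step: an initial error of at most 2^-5 becomes at most 2^-40 after three steps.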

VII.   The Iterative Division

For the moment, consider only the part of the iteration involved in
forming the $a_n$. This part is independent of the formation of $b_n$. We will assume
that the divisor, x, is positive and normalized to lie in the range 1/2 ≤ x < 1.
By inspection of the first seven digits of x following the binary point, the
approximation p will be generated. If x has the form

0.1abcdef,

and p is given the form 1.qrst, then suitable expressions for the digits of p are

q  =  ab  v  ac

r  =  be  ^  abc  ^  acde  ^  bdef 

s  =  ace  v  abc  v  bde  v  abce  v  abed  ^  abed  v  abedf  v  a,bcef 

t  =  acd  v/  -bed  v  bde  v  bede  v  abc"e  v  bede  v  abdf  ^  aedf  v  abed 
^  abedf  ^  abede  v  abedf 

These  expressions  are  essentially  raw  minterm  forms  and  may  not  be 
minimal.   However,  even  as  they  stand,  they  could  be  realized  quite  cheaply 
and  quickly  with  diode  logic. 

The values of p yielded by these expressions are such that xp always
has one of the forms

0.11111. . . 

or 

1. 00000. . . 

(The  digit  in  position  zero  should  be  interpreted  with  positive  weight.) 

The set of p values chosen is not unique in having this property,
but they appear to require the simplest logical expressions for their
generation.
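
That a suitable five-digit p exists for every leading-digit pattern of x can be confirmed by exhaustive search. The sketch below is not the author's set of logical expressions and may pick different p values, since the choice is not unique; the function name and the integer scaling (x in units of 1/128, p in units of 1/16) are assumptions made for the illustration.

```python
def reciprocal_table():
    """For each pattern 0.1abcdef find p = 1.qrst with 31/32 <= xp < 33/32."""
    table = {}
    for m in range(64, 128):        # x lies in [m/128, (m+1)/128)
        for n in range(16, 32):     # p = n/16
            low_ok = n * m >= 31 * 64          # xp >= 31/32 at the low end of x
            high_ok = n * (m + 1) <= 33 * 64   # xp < 33/32 below the high end
            if low_ok and high_ok:
                table[m] = n        # keep the smallest workable p
                break
    return table


table = reciprocal_table()
```

Every one of the 64 patterns receives an entry, so inspecting seven digits of x is enough to make xp take one of the forms 0.11111... or 1.00000...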

The first step of the process will consist of forming p, recoding it
to give three partial products each either -2, -1, 0, +1 or +2 times x, and
summing these. (It may possibly be advantageous to generate the recoded
multiplier digits directly from the digits of x.)

Only digital positions 5 et seq. of the product need be explicitly formed.


The next step of the process should be to take $a_1$ as formed and
multiply it by $(2 - a_1)$, with the aim of producing as $a_2$ a number with digits
1 to 10 all the complement of digit zero. This aim can be achieved by using a
multiplier which approximates $(2 - a_1)$, but which has many fewer significant
digits.

Consider a number of the form of $a_1$, viz.,

$$\bar{p}.pppppqrstuv\ldots$$

and the following approximation to $2 - a_1$:

$$1 - 2^{-5}(p.qrstu)$$

where the number in brackets is interpreted as a signed two's complement
fraction, hereafter called d, in the range -1 to 1 - 1/32. We may write $a_1$
as $1 + 2^{-5}d + e$, where e is in the range $0 \le e < 2^{-10}$. The product of
the two numbers will be

$$1 - 2^{-10}d^2 + e(1 - 2^{-5}d)$$

This number will have a minimum value, when e = 0 and d = -1, of $1 - 2^{-10}$
and a maximum value, when e is just less than $2^{-10}$ and d = 0, of just less than
$1 + 2^{-10}$. (Although, with e at its maximum value, differentiation of the
above expression with respect to d would give a stationary value with d
slightly negative, in fact the smallest negative allowable value for d, viz.,
-1/32, is already past the stationary point and yields a value for the product
of $1 + e - 2^{-10}(2^{-10} - e)$, which is slightly less than the value quoted above.)

Thus the approximation to $(2 - a_1)$ given above always yields a
product of the desired form, differing from one by an amount in the range
$-2^{-10}$ to just under $2^{-10}$.
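
These bounds can be checked numerically over every allowable d. The sketch below uses exact fractions; the function name is invented, and because the product is linear in e for fixed d, only the two ends of the range of e are sampled (the upper end is represented by an assumed value just below 2^-10).

```python
from fractions import Fraction


def product(d, e):
    """(1 + d/32 + e) * (1 - d/32), i.e. a_1 times the shortened multiplier."""
    a1 = 1 + d / 32 + e
    multiplier = 1 - d / 32
    return a1 * multiplier


bound = Fraction(1, 2 ** 10)
e_values = [Fraction(0), bound - Fraction(1, 2 ** 40)]   # ends of 0 <= e < 2**-10
products = [product(Fraction(k, 32), e)
            for k in range(-32, 32)                      # d from -1 to 1 - 1/32
            for e in e_values]
lowest, highest = min(products), max(products)
```

The lowest product occurs at d = -1, e = 0 and equals 1 - 2^-10 exactly, while the highest stays strictly below 1 + 2^-10.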

Thus this multiplier, which can be recoded to give four partial
products, is used in the second step of the iteration to give $a_2$.

Similarly, the number formed by adding one to $2^{-10}$ times the
signed two's-complement fraction represented by digits 10 to 20 of $a_2$ is an
adequate multiplier for use in step three. It may be recoded to yield seven
partial products, and will give an $a_3$ whose digits 1 to 20 will all differ
from its digit zero.

The multiplier for step four will be one plus $2^{-20}$ times the signed
two's-complement fraction represented by digits 20 to 40 of $a_3$. However, no
$a_4$ will be generated.

In forming $b_4$, the final answer, we could start with the dividend
y and multiply it successively by the four multipliers used in the four steps.
This would require four multiplication times in addition to the three needed
to form $a_1$, $a_2$ and $a_3$. If, however, we instead generate the reciprocal of x,
then y = 1 and $b_1$ is simply p, which is available. As will be shown below, it
is possible, because $b_1$ is then a number of only five digits,
to obtain $b_2$ at the same time as $a_2$ and $b_3$ at the same time as $a_3$. Thus $b_4$,
the reciprocal, can be obtained in four multiplication times, and a true division
can be done in five multiplication times, as opposed to seven. Also, the
reciprocal, yielded as a byproduct, may often be useful to the programmer.
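
The four steps can be followed numerically. The sketch below is illustrative only and rests on several assumptions that are not taken from the text: the names `truncated_multiplier` and `reciprocal` are hypothetical, p is taken as 1/x rounded to the nearest sixteenth, every multiplier has the form 1 - 2^-k d with d obtained by truncation as in the step-two analysis, and exact fractions replace the truncated adder words.

```python
import math
from fractions import Fraction

def truncated_multiplier(a, k):
    """Return 1 - 2^-k * d, where d is (a - 1) * 2^k cut to k fraction digits."""
    d_int = math.floor((a - 1) * 2**(2 * k))   # floor keeps 0 <= e < 2^-2k
    return 1 - Fraction(d_int, 2**(2 * k))

def reciprocal(x):
    """Four-step reciprocal of a normalized x in [1/2, 1)."""
    p = Fraction(round(16 / x), 16)     # assumed five-digit first guess
    a, b = x * p, p                     # step one: a1 = x*p, b1 = p
    for k in (5, 10):                   # steps two and three form a and b together
        m = truncated_multiplier(a, k)
        a, b = a * m, b * m
    return b * truncated_multiplier(a, 20)   # step four: only b4 is formed

x = Fraction(5, 8)
print(float(reciprocal(x)), float(1 / x))
```

Under these assumptions x times the result differs from one by at most 2^-40, which is the accuracy the iteration is intended to deliver for a 40-digit word.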


VIII.  Details of Reciprocal Generator

Assume the existence of a register R having digital positions $R_i$,
where $i = 0, 1, 2, \ldots$  Initially, the positive normalized number x occupies
R  to  RoQ=   Digits  R   -  R  are  decoded  to  give  p  and  hence  a  recoded  multiplier 
of the form s0q0r where s, q and r take values +2, +1, 0, -1, -2. Three
partial products are formed by selector circuits other than those normally used
to produce $W_i$. These selectors introduce their output words into the adder
tree at the points labelled on Figure 1 as B, C and D. If the normally
used multiplier input is made zero during this and subsequent steps, all
transistors of the adder tree above the levels at which numbers are specifically
introduced will be off, so that the collectors of the selector circuits
introducing numbers at these points may simply be commoned to the adder tree
collectors normally supplying these points. We will have

at B:   (R    ) s in positions 0-37 (a two-place left displacement)

at C:   (R, ~Q) q in positions 1-39 (no displacement)

at D:   (R    ) r in positions 3-41 (a two-place right displacement)

In generating these partial products, correction bits are applied in the
usual way. That for the entry at B can be introduced into the appropriate
digital position at point A of Figure 1. (If x is positive, s will never
in fact be negative. However, a simple way to produce reciprocals of
negative numbers is to use a negative multiplier in step one.)
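
Digits restricted to +2, +1, 0, -1, -2 are what a radix-4 signed-digit recoding of a two's-complement number produces. As a minimal sketch, assuming the standard modified-Booth rule rather than the particular recoder circuit of this design, and with the hypothetical name `recode_radix4`:

```python
def recode_radix4(value, nbits):
    """Recode an nbits-wide two's-complement integer into radix-4 digits.

    Returns the digits least significant first, each in -2..+2, such that
    sum(digit * 4**j) equals value. nbits must be even.
    """
    if nbits % 2:
        raise ValueError("nbits must be even")
    word = value & ((1 << nbits) - 1)             # two's-complement bit pattern
    bits = [(word >> i) & 1 for i in range(nbits)]
    digits = []
    below = 0                                     # bit just below the current pair
    for i in range(0, nbits, 2):
        digits.append(-2 * bits[i + 1] + bits[i] + below)
        below = bits[i + 1]
    return digits

print(recode_radix4(-11, 6))
```

Each recoded digit selects one partial product (zero, plus or minus the multiplicand, or plus or minus twice the multiplicand), so an n-digit multiplier needs about n/2 partial products.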

If the output of the adder tree is now gated back without shifting
into R, we will have in $R_{0-41}$ digits 2-43 of $a_1$, of which digits 2-5, i.e.,
$R_{0-3}$, will be identical. At the same time we gate digits 0 to 4 of the
multiplier p just used into $R_{51-55}$. This completes step one.

In  the  second  step,  R   q,  i.e.,  digits  s  to  10  of  a  ,  are  decoded 
to give a recoded multiplier of the form 1.00000c0d0e giving four partial
products introduced at the adder-tree points A, B, C and D.

at A:  $R_{8-41}$ in positions 0-33, $R_{52-55}$ in positions 44-47
       (an eight-place left displacement)

at B:  $(R_{2-41})$ c in positions 0-39, $(R_{51-55})$ c in positions 49-53
       (two-place left shift)

at C:  $(R_{0-41})$ d in positions 0-41, $(R_{51-55})$ d in positions 51-55
       (no displacement)

at D:  $(R_{0-41})$ e in positions 2-43, $(R_{51-55})$ e in positions 53-57
       (two-place right shift)

Note that the entries to B, C and D are displaced in the same way as for
step one. Thus the same selector inputs may be used. Carries from digital
position 44 to position 43 of adder words 1 and 2 of the adder tree are
inhibited, to give effectively independent multiplication of $a_1$ and $b_1$
by the same multiplier. The $b_1$ part of the entries at B, C and D must be
augmented by sign digit copies as far left as position 44.

Gating the tree output back into R gives digits 10 to 53 of $a_2$ in
$R_{0-43}$, and digits 1-14 of $b_2$ in $R_{44-57}$. Digit 0 of $b_2$ is known to be one,
and need not be formed. It can be thought of as occupying an additional
flipflop  R^  for  the  discussion  of  later  steps. 

In the third step, digits 10-20 of $a_2$ in $R_{0-10}$ are decoded to give
a multiplier of the form

1.000000000f0g0h0i0j0k

giving seven partial products, which must be formed in specially-provided
selectors and introduced into adder tree points E, F, G, H, I, J and K.

at E:  $R_{10-43}$ in positions 0-33,  $R_{53-57}$ in positions 35-39

at F:  $R_{0-33}$ in positions 0-33,   $R_{43-57}$ in positions 35-49

at G:  $R_{0-31}$ in positions 2-33,   $R_{43-57}$ in positions 37-51

at H:  $R_{0-29}$ in positions 4-33,   $R_{43-57}$ in positions 39-53

at I:  $R_{0-27}$ in positions 6-33,   $R_{43-57}$ in positions 41-55

at J:  $R_{0-25}$ in positions 8-33,   $R_{43-57}$ in positions 43-57

at K:  $R_{0-23}$ in positions 10-33,  $R_{43-57}$ in positions 45-59

Carry from position 34 to position 33 is inhibited in all relevant adder words. As
described above, only digits 10 onwards of $b_3$ are formed. However, if, in the carry
propagating adder, an additional 10-digit section is added to the left of position
34, this section and the stage in position 34 can receive digits 0-9 of $b_2$, i.e.,
$R_{43-52}$, and any carry (which may be negative) into position 34 generated in the adder
tree. Since, in this step, positions of the double-length carry propagation adder
to the right of position 59 are not used, ten of these could be switched to form the
required additional section.

The adder output thus contains digits 20 to 53 of $a_3$ in positions 0 to 33,
and digits 0 to 34 of $b_3$ in positions 25-59. In this third step some truncation of
$a_3$ has occurred. However, digits of $b_3$ are retained as far right as digit 53, and
only digits 20 to 40 are required in step four. Thus the truncation error introduced
is very small.


Digits 20 to 40 of $a_3$ are gated into $R_{0-20}$, and digits 0-34 of $b_3$ are gated
into $R_{25-59}$.

In the fourth and final step, $R_{0-20}$ are decoded to give a recoded
multiplier specifying twelve partial products. These could be introduced into level five
of the adder tree, but only at considerable expense. The multiplier is therefore used
to control a standard multiplication step, using little or no additional equipment
to produce the reciprocal, $b_4$. Thus the formation of a reciprocal requires 15 adder
word delays, four carry propagating addition delays, of which only the last is of the
normal length, and four recoding and selection delays.
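
The partial-product counts quoted for the four steps (three, four, seven and twelve) are consistent with radix-4 recoding of each multiplier. The check below is a sketch resting on assumptions: the helper `partial_products` is hypothetical, the digit ranges are those inferred from the step descriptions (in particular digits 5 to 10 of a1 for step two), and one extra word is counted for the leading 1 of the multipliers used in steps two to four.

```python
import math

def partial_products(multiplier_digits, leading_one):
    """Radix-4 digits needed for the recoded part, plus one word for a leading 1."""
    return math.ceil(multiplier_digits / 2) + (1 if leading_one else 0)

steps = {
    "one":   partial_products(5, leading_one=False),   # p, digits 0 to 4
    "two":   partial_products(6, leading_one=True),    # digits 5 to 10 of a1 (assumed)
    "three": partial_products(11, leading_one=True),   # digits 10 to 20 of a2
    "four":  partial_products(21, leading_one=True),   # digits 20 to 40 of a3
}
print(steps)
```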

If the range of digital positions of the numbers involved in each step is
examined and compared with the list of adder word dimensions, it will be found that
all numbers will fit into the adder word dimensions already prescribed, with the
exception of the partial products input in step 3 at points I, J and K of the adder
tree. To accommodate these, adder word 4 must be extended by three digital positions
at its left-hand end. The reciprocal-forming process also requires an additional
eleven words of selector circuits, with drivers.


IX.   A Square-Root Process


An iterative procedure for generating the reciprocal of the square root
of a number, very similar to that used above for obtaining the reciprocal, is: given
a number x, and an approximation p to the reciprocal of its root, set $a_1 = xp^2$,
$b_1 = p$ and iterate

$$a_{n+1} = a_n\left(\tfrac{3}{2} - \tfrac{1}{2}a_n\right)^2, \qquad b_{n+1} = b_n\left(\tfrac{3}{2} - \tfrac{1}{2}a_n\right)$$

The process converges quadratically, $a_n$ to 1, and $b_n$ to the required
reciprocal root. If p and $p^2$ are provided by inspection of x, eight multiplications
are required to obtain $b_4$ which, if $xp^2$ differs from one by less than 1/32, will have
about 40 correct digits. It is possible that the eight multiplications could be done
in six steps, by a partitioning of the adder tree similar to that suggested above.
Even if this were not so, it would still be quite a rapid process. The true square
root can of course be obtained from its reciprocal by multiplication by x.
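
A short numerical sketch of this iteration follows. It assumes the usual form of the recurrence, in which the common factor is 3/2 - a_n/2, uses exact fractions in place of truncated machine words, and the function name `reciprocal_sqrt` is hypothetical. The counter reproduces the figure of eight multiplications: one for a1, two for each of a2 and a3, and one for each of b2, b3 and b4 (a4 is never formed).

```python
from fractions import Fraction

def reciprocal_sqrt(x, p, p_squared):
    """Return (b4, multiplications) for the reciprocal square root of x.

    p and p_squared are assumed to be supplied together, e.g. by table lookup.
    """
    mults = 0
    a = x * p_squared                   # a1 = x * p^2
    mults += 1
    b = p                               # b1 = p
    for step in range(3):               # produce b2, b3, b4
        m = Fraction(3, 2) - a / 2      # common factor (3/2 - a_n/2)
        b = b * m
        mults += 1
        if step < 2:                    # a4 is not needed
            a = (a * m) * m
            mults += 2
    return b, mults

x, p = Fraction(3, 4), Fraction(37, 32)
b4, count = reciprocal_sqrt(x, p, p * p)
print(count, float(b4), float(x * b4))  # x * b4 approximates the square root of x
```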

X.     Speed and Cost

In the limit of a large number of digits per word, most conventional
multiplier structures have a product of equipment cost and multiplication time which varies
as the square of the number of digits. In the present structure the equipment cost
varies as the square of the number of digits, and the multiplication time varies as the
logarithm of the number of digits. Thus, the present structure has a cost-time product
increasing more rapidly with increasing word length than that of conventional structures. It is
therefore less efficient in the long-word limit. However, it is not necessarily as
inefficient as one might suppose for the proposed word length of 40 bits, particularly
in the context of existing transistor technology. The apparent logarithmically
increasing inefficiency is a reflection of the fact that, while the multiplication time
depends upon the propagation delay of signals passing through the logarithmically
increasing number of adder tree levels, each logical element of the structure is used
only once during the multiplication process. Thus, if one defines the useful duty
cycle of a logical element as the ratio between its propagation delay and the period
between meaningful and distinct uses of its output, this duty cycle is in the present
structure logarithmically decreasing with word length. However, the duty cycle in the
case of 40 digits is about 1/15, which is not greatly below the upper limit set by the
characteristics of the circuits used. Typical transistor circuits having propagation
delays of 15 to 30 nsec are very difficult to operate at repetition rates above 10 mc,
especially when allowance is made for the reliable distribution time of clock and
gating signals, and for the settling time of flipflops.

The  equipment  requirements  of  this  structure  are  approximately  as  follows: 

For multiplication:

                         Transistors   Diodes       Number     Total         Total
Circuit Type             (per unit)    (per unit)   of units   Transistors   Diodes

Full adders                   3            18          750        2,250       13,500
Selectors                     1            13          840          840       10,920
Recoders                      4           ~10           80          320          800
Multiplicand Drivers          4             3           80          320          240

Totals for multiplication:                                        3,730       25,460

Additional requirements for division:

Recoders                      4            10           15           60          150
Multiplicand Drivers          4             3           60          240          180
Selectors                     1            13          561          561        7,293

Grand Totals:                                                     4,591       33,083
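
The totals in the table follow directly from the per-unit counts. A small check, with hypothetical variable and function names and the approximate recoder diode count taken as 10:

```python
# (circuit, transistors per unit, diodes per unit, number of units)
multiplication = [
    ("Full adders", 3, 18, 750),
    ("Selectors", 1, 13, 840),
    ("Recoders", 4, 10, 80),
    ("Multiplicand drivers", 4, 3, 80),
]
division_extra = [
    ("Recoders", 4, 10, 15),
    ("Multiplicand drivers", 4, 3, 60),
    ("Selectors", 1, 13, 561),
]

def component_totals(rows):
    """Return (total transistors, total diodes) for a list of table rows."""
    transistors = sum(t * n for _, t, _, n in rows)
    diodes = sum(d * n for _, _, d, n in rows)
    return transistors, diodes

print(component_totals(multiplication))
print(component_totals(multiplication + division_extra))
```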

Not included in the above estimates are the carry propagating adder,
the registers necessary to hold operands and results, and the control circuitry.
This equipment would almost certainly be present in the computer arithmetic
unit for addition-subtraction, and should not be charged specifically to the
multiplier-divider. That is, the totals above represent the additional equipment
required for the proposed multiply-divide scheme over and above that necessary
for even the most primitive parallel arithmetic unit. The additional equipment
is perhaps ten per cent of the semiconductor complement of a modern, large-
scale computer, but would almost certainly represent much less than ten per cent
of the cost of a large computer.

To estimate the time required for a multiplication, it is assumed that

i)    The propagation delay per transistor is 30 nsec; the delay
      per adder tree level accordingly is 60 nsec.

ii)   The propagation delay of the high-current drivers is 100 nsec.

iii)  The settling time of the carry-propagating adder is 100 nsec.

iv)   The result will be gated into a register with a gating time
      of 100 nsec.

On this basis, the multiplication time becomes 750 nsec. This
should be a fairly conservative estimate, the circuit delays being those
obtainable at reasonable cost in 1962 using readily available components.

On the same basis, the time required for the generation of a
reciprocal, excluding the pre-normalization time, is 2220 nsec. The time for
a full division is therefore about 3 µsec.


XI.   A  Simpler  Version 

If a 40 x 40 bit multiplication is performed in two steps, the adder
tree technique described above can be used in a simpler form. In the first
step, the 22 least-significant multiplier digits would be recoded to yield 11
partial products. A carry can arise in the recoding process, to be incorporated
in the recoding of the remaining multiplier digits in the second step. An adder
tree of five levels containing nine adder words is used to reduce the 11
partial products to a sum word and a carry word having digits in positions
17 to 78 (approximately). Of these words, digits in positions 57 to 78 are
final, and can be added in the carry-propagating adder to give digits of the
final product. The remaining digits of both words, together with the nine
partial products formed by recoding digits 0-17 of the multiplier, are summed
in the same tree, with the output words added in the carry-propagating adder to
yield the rest of the product.
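
The arithmetic behind the two-step scheme can be sketched as follows. This is a simplified illustration under stated assumptions: operands are treated as non-negative integers, the function name `two_step_multiply` is hypothetical, and the recoding carry and the sum-word and carry-word form of the intermediate result are not modelled.

```python
def two_step_multiply(x, y, low_bits=22):
    """Multiply x by y in two passes over the multiplier digits."""
    mask = (1 << low_bits) - 1
    first = x * (y & mask)             # step one: low multiplier digits
    final_low = first & mask           # these product digits are already final
    carried = first >> low_bits        # remaining digits re-enter the tree
    second = carried + x * (y >> low_bits)   # step two: high multiplier digits
    return (second << low_bits) | final_low

x, y = 0x9ABCDEF012, 0x7654321FED
print(two_step_multiply(x, y) == x * y)
```

The low 22 product digits settle in the first step because the second step contributes only to higher positions, which is why they can be passed straight to the carry-propagating adder.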

The equipment cost of this scheme would be about half that of the
one described above, and, making the same assumptions as above, the
multiplication time would be about 1.2 µsec. The time for reciprocal formation
remains unchanged, since by increasing the number of adders in the tree to
ten, the tree can be made capable of all operations required by the process
described. However, the time required for a division is increased by the
increase in the multiplication time to about 3.5 µsec. Such a scheme might
well be attractive in some circumstances.
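
The division times quoted for the two versions are consistent with one reciprocal generation followed by one ordinary multiplication. A minimal check of that arithmetic, with hypothetical names and the figures stated in the text:

```python
RECIPROCAL_NS = 2220          # reciprocal generation, same in both versions

def division_time_ns(multiply_ns):
    """Division as reciprocal generation followed by one multiplication."""
    return RECIPROCAL_NS + multiply_ns

full_version = division_time_ns(750)      # one-step multiplier
simple_version = division_time_ns(1200)   # two-step multiplier
print(full_version, simple_version)
```

These come to 2970 nsec and 3420 nsec, matching the rounded figures of about 3 µsec and about 3.5 µsec.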


XII.   Conclusion

A method of performing multiplications has been described using a
large amount of equipment to produce the product in a one-step combinatorial
manner. Although in principle rather inefficient, this process is reasonably
well matched to the characteristics of saturating diode-transistor circuitry,
and the considerable increase it could yield in the overall speed of a large
computer might well justify its cost. A four-step method for obtaining
reciprocals can be employed using essentially the same equipment to give
a fairly rapid substitute for division. A perhaps slightly more efficient
scheme employing a little more than half as much equipment can multiply
in a little less than twice the time required by the more expensive scheme,
using two steps. This cheaper version is as fast as the more expensive
when generating reciprocals.